

 meta platform


Generative Early Stage Ranking

Hong, Juhee, Liu, Meng, Wang, Shengzhi, Mao, Xiaoheng, Cheng, Huihui, Gao, Leon, Leung, Christopher, Zhou, Jin, Sekar, Chandra Mouli, Zhu, Zhao, Liu, Ruochen, Trieu, Tuan, Sun, Dawei, Kanjani, Jeet, Li, Rui, Qian, Jing, Cao, Xuan, Fan, Minjie, Gao, Mingze

arXiv.org Artificial Intelligence

Large-scale recommendation systems commonly adopt a multi-stage cascading ranking paradigm to balance effectiveness and efficiency. Early Stage Ranking (ESR) systems utilize the "user-item decoupling" approach, where independently learned user and item representations are only combined at the final layer. While efficient, this design is limited in effectiveness, as it struggles to capture fine-grained user-item affinities and cross-signals. To address these limitations, we propose the Generative Early Stage Ranking (GESR) paradigm, introducing the Mixture of Attention (MoA) module which leverages diverse attention mechanisms to bridge the effectiveness gap: the Hard Matching Attention (HMA) module encodes explicit cross-signals by computing raw match counts between user and item features; the Target-Aware Self Attention module generates target-aware user representations conditioned on the item, enabling more personalized learning; and the Cross Attention modules facilitate early and more enriched interactions between user-item features. MoA's specialized attention encodings are further refined in the final layer through a Multi-Logit Parameterized Gating (MLPG) module, which integrates the newly learned embeddings via gating and produces secondary logits that are fused with the primary logit. To address the efficiency and latency challenges, we introduce a comprehensive suite of optimization techniques. These span from custom kernels that maximize the capabilities of the latest hardware to efficient serving solutions powered by caching mechanisms. The proposed GESR paradigm has shown substantial improvements in topline metrics, engagement, and consumption tasks, as validated by both offline and online experiments. To the best of our knowledge, this marks the first successful deployment of full target-aware attention sequence modeling within an ESR stage at such a scale.
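The "raw match counts between user and item features" that the HMA module is said to compute can be pictured with a small sketch. The abstract does not say which features are matched or how the counts enter the model, so the field names (`author_id`, `topic`) and the function `hard_match_counts` below are hypothetical and only illustrate the counting idea.

```python
def hard_match_counts(user_history, item_features):
    """For each item feature field, count the history events that carry the same value.

    Illustrative only: the real HMA module is not described at this level of detail.
    """
    counts = {}
    for field, value in item_features.items():
        counts[field] = sum(1 for event in user_history if event.get(field) == value)
    return counts


# Hypothetical user history and candidate item.
history = [
    {"author_id": 7, "topic": "cooking"},
    {"author_id": 7, "topic": "travel"},
    {"author_id": 3, "topic": "cooking"},
]
candidate = {"author_id": 7, "topic": "cooking"}

counts = hard_match_counts(history, candidate)
print(counts)
```

In a decoupled two-tower setup, counts like these are unavailable because user and item towers never see each other's raw features; computing them per candidate is one way an explicit cross-signal could be exposed to the ranker.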


Deepfake political scam ads surge on Meta platforms, watchdog says

The Japan Times

According to Meta's rules, advertisers who seek to run political ads in the United States have to undergo a special authorization process. Washington - Scammers are among the top political ad spenders on Meta's platforms, using deepfake videos of American politicians -- including President Donald Trump -- to promote fake government benefits, a watchdog group said Wednesday. The nonprofit Tech Transparency Project said it identified 63 scam advertisers that collectively spent $49 million on Facebook and Instagram, often targeting seniors with ads promoting fake stimulus checks, government spending cards and healthcare payments. The ads have reached tens of thousands of the platforms' users.


FaMA: LLM-Empowered Agentic Assistant for Consumer-to-Consumer Marketplace

Yan, Yineng, Wang, Xidong, Cheng, Jin Seng, Hu, Ran, Guan, Wentao, Farahmand, Nahid, Lin, Hengte, Li, Yue

arXiv.org Artificial Intelligence

The emergence of agentic AI, powered by Large Language Models (LLMs), marks a paradigm shift from reactive generative systems to proactive, goal-oriented autonomous agents capable of sophisticated planning, memory, and tool use. This evolution presents a novel opportunity to address long-standing challenges in complex digital environments. Core tasks on Consumer-to-Consumer (C2C) e-commerce platforms often require users to navigate complex Graphical User Interfaces (GUIs), making the experience time-consuming for both buyers and sellers. This paper introduces a novel approach to simplify these interactions through an LLM-powered agentic assistant. This agent functions as a new, conversational entry point to the marketplace, shifting the primary interaction model from a complex GUI to an intuitive AI agent. By interpreting natural language commands, the agent automates key high-friction workflows. For sellers, this includes simplified updating and renewal of listings, and the ability to send bulk messages. For buyers, the agent facilitates a more efficient product discovery process through conversational search. We present the architecture for Facebook Marketplace Assistant (FaMA), arguing that this agentic, conversational paradigm provides a lightweight and more accessible alternative to traditional app interfaces, allowing users to manage their marketplace activities with greater efficiency. Experiments show FaMA achieves a 98% task success rate on solving complex tasks on the marketplace and enables up to a 2x speedup on interaction time.


RADAR: Recall Augmentation through Deferred Asynchronous Retrieval

Jaspal, Amit, Dang, Qian, Ramineni, Ajantha

arXiv.org Artificial Intelligence

Modern large-scale recommender systems employ a multi-stage ranking funnel (Retrieval, Pre-ranking, Ranking) to balance engagement and computational constraints (latency, CPU). However, the initial retrieval stage, often relying on efficient but less precise methods like K-Nearest Neighbors (KNN), struggles to effectively surface the most engaging items from billion-scale catalogs, particularly distinguishing highly relevant and engaging candidates from merely relevant ones. We introduce Recall Augmentation through Deferred Asynchronous Retrieval (RADAR), a novel framework that leverages asynchronous, offline computation to pre-rank a significantly larger candidate set for users using the full-complexity ranking model. These top-ranked items are stored and utilized as a high-quality retrieval source during online inference, bypassing online retrieval and pre-ranking stages for these candidates. We demonstrate through offline experiments that RADAR significantly boosts recall (2x Recall@200 vs. DNN retrieval baseline) by effectively combining a larger retrieved candidate set with a more powerful ranking model. Online A/B tests confirm a +0.8% lift in topline engagement metrics, validating RADAR as a practical and effective method to improve recommendation quality under strict online serving constraints.
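The flow described above (rank a large pool offline with the full model, store the top items, then let them skip online retrieval and pre-ranking) can be sketched as follows. All names here (`offline_prerank`, `candidates_for_final_ranking`, the in-memory `store`) are hypothetical, and the toy scores stand in for a real ranking model; the abstract does not describe the storage layer or how the sources are merged.

```python
def offline_prerank(user_id, candidate_pool, full_ranker, top_k):
    """Asynchronous/offline step: score a large pool with the full ranking model."""
    ranked = sorted(candidate_pool, key=lambda item: full_ranker(user_id, item), reverse=True)
    return ranked[:top_k]


def candidates_for_final_ranking(user_id, store, online_funnel):
    """Online step: stored items go straight to final ranking; the rest come from
    the usual retrieval + pre-ranking funnel. Duplicates are dropped (an assumption)."""
    merged = list(store.get(user_id, []))
    seen = set(merged)
    for item in online_funnel(user_id):
        if item not in seen:
            merged.append(item)
            seen.add(item)
    return merged


# Toy stand-ins for the full model and the online funnel.
toy_scores = {"a": 0.9, "b": 0.2, "c": 0.7, "d": 0.5}
store = {"u1": offline_prerank("u1", list(toy_scores), lambda u, i: toy_scores[i], top_k=2)}
final_input = candidates_for_final_ranking("u1", store, lambda u: ["c", "b"])
print(store["u1"], final_input)
```

The point of the split is that the expensive `full_ranker` call happens off the request path, so the online cost of the extra source is a key-value lookup.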


DV365: Extremely Long User History Modeling at Instagram

Lyu, Wenhan, Tyagi, Devashish, Yang, Yihang, Li, Ziwei, Somani, Ajay, Shanmugasundaram, Karthikeyan, Andrejevic, Nikola, Adeputra, Ferdi, Zeng, Curtis, Singh, Arun K., Ransan, Maxime, Jain, Sagar

arXiv.org Artificial Intelligence

Long user history is a highly valuable signal for recommendation systems, but effectively incorporating it often comes at a high cost in data center power consumption and GPU usage. In this work, we chose offline embedding over end-to-end sequence length optimization methods to enable extremely long user sequence modeling as a cost-effective solution, and propose a new user embedding learning strategy, multi-slicing and summarization, that generates a highly generalizable representation of a user's long-term stable interests. The history length encoded in this embedding is up to 70,000 and 40,000 on average. This embedding, named DV365, has proven highly incremental on top of the advanced attentive user sequence models deployed at Instagram. Produced by a single upstream foundational model, it has been launched in 15 different models across Instagram and Threads with significant impact, and has been battle-tested in production for more than a year since our first launch.
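The abstract names the strategy "multi-slicing and summarization" without defining it. One possible reading, shown here purely as an assumption, is to cut the long history into slices (below, by a hypothetical `action` field), summarize each slice with mean pooling, and concatenate the summaries into a fixed-size vector. The slicing criterion, the pooling choice, and every name in the sketch are illustrative, not the DV365 design.

```python
def mean_pool(vectors, dim):
    """Summarize a slice as the element-wise mean; an empty slice becomes zeros."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def slice_and_summarize(history, slice_keys, dim):
    """Build one fixed-size vector by concatenating a summary per slice."""
    summary = []
    for key in slice_keys:
        slice_vectors = [e["embedding"] for e in history if e["action"] == key]
        summary.extend(mean_pool(slice_vectors, dim))
    return summary


# Hypothetical history with 2-dimensional item embeddings.
history = [
    {"action": "like", "embedding": [1.0, 0.0]},
    {"action": "like", "embedding": [3.0, 2.0]},
    {"action": "share", "embedding": [0.0, 4.0]},
]
user_vector = slice_and_summarize(history, ["like", "share", "comment"], dim=2)
print(user_vector)
```

Whatever the actual slicing is, the appeal of this shape is that the output size depends on the number of slices, not on the history length, so tens of thousands of events can be compressed offline into a vector that downstream models consume cheaply.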


Personalized Interpolation: An Efficient Method to Tame Flexible Optimization Window Estimation

Zhang, Xin, Li, Weiliang, Li, Rui, Fu, Zihang, Tang, Tongyi, Zhang, Zhengyu, Chen, Wen-Yen, Noorshams, Nima, Jasapara, Nirav, Ding, Xiaowen, Wen, Ellie, Feng, Xue

arXiv.org Artificial Intelligence

In the realm of online advertising, optimizing conversions is crucial for delivering relevant products to users and enhancing business outcomes. Predicting conversion events is challenging due to variable delays between user interactions, such as impressions or clicks, and the actual conversions. These delays differ significantly across various advertisers and products, necessitating distinct optimization time windows for targeted conversions. To address this, we introduce a novel approach named the Personalized Interpolation method, which builds upon existing fixed conversion window models to estimate flexible conversion windows. This method allows for the accurate estimation of conversions across a variety of delay ranges, thus meeting the diverse needs of advertisers without increasing system complexity. To validate the efficacy of our proposed method, we conducted comprehensive experiments using an ads conversion model. Our experiments demonstrate that this method not only achieves high prediction accuracy but also does so more efficiently than other existing solutions. This validation underscores the potential of our Personalized Interpolation method to significantly enhance conversion optimization in real-world online advertising systems, promising improved targeting and effectiveness in advertising strategies.
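The idea of reusing fixed-window models to estimate an arbitrary window can be illustrated with plain linear interpolation between the two nearest fixed windows. This is a minimal sketch under stated assumptions: the abstract does not give the interpolation formula or say how it is personalized, so the function `interpolate_conversion`, the optional `warp` hook (a stand-in for a learned, per-advertiser adjustment), and the example probabilities are all hypothetical.

```python
from bisect import bisect_left


def interpolate_conversion(fixed_preds, target_days, warp=lambda a: a):
    """Estimate P(conversion within target_days) from fixed-window predictions.

    fixed_preds maps window length in days -> predicted conversion probability.
    Targets outside the covered range are clamped to the nearest window,
    which is a simplification of this sketch.
    """
    windows = sorted(fixed_preds)
    if target_days <= windows[0]:
        return fixed_preds[windows[0]]
    if target_days >= windows[-1]:
        return fixed_preds[windows[-1]]
    hi_idx = bisect_left(windows, target_days)
    lo, hi = windows[hi_idx - 1], windows[hi_idx]
    alpha = warp((target_days - lo) / (hi - lo))  # position between the two windows
    return fixed_preds[lo] + alpha * (fixed_preds[hi] - fixed_preds[lo])


# Hypothetical outputs of 1-day, 7-day and 28-day conversion models.
preds = {1: 0.02, 7: 0.05, 28: 0.08}
p3 = interpolate_conversion(preds, 3)
print(round(p3, 4))
```

Because only the existing fixed-window predictions are consumed, a new window length needs no additional model, which is consistent with the abstract's claim of not increasing system complexity.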


US tech stocks send Nasdaq to hit record high, as Alphabet beats forecasts

Al Jazeera

US tech stocks have catapulted the Nasdaq to a record high as investors bet on strong earnings from corporate heavyweights. The tech-heavy Nasdaq Composite Index rose 0.8 percent on Tuesday, as Google's parent company Alphabet reported forecast-beating earnings for the third quarter. Alphabet's revenue jumped 15 percent to $88.3bn during the July-September period, while profit surged 34 percent to $26.3bn. Google and Alphabet CEO Sundar Pichai said the company was experiencing "extraordinary" momentum due to the strong performance of its search and cloud businesses as well as its focus on innovation, including artificial intelligence. "Our commitment to innovation, as well as our long-term focus and investment in AI, are paying off and driving success for the company and for our customers," Pichai said on an earnings call.


Ads Supply Personalization via Doubly Robust Learning

Shi, Wei, Fu, Chen, Xu, Qi, Chen, Sanjian, Zhang, Jizhe, Zhu, Qinqin, Hua, Zhigang, Yang, Shuang

arXiv.org Artificial Intelligence

Ads supply personalization aims to balance revenue and user engagement, two long-term objectives in social media ads, by tailoring ad quantity and density. In industry-scale systems, the challenge for ads supply lies in modeling the counterfactual effects of a conservative supply treatment (e.g., a small density change) over an extended duration. In this paper, we present a streamlined framework for personalized ad supply. This framework optimally utilizes information from data collection policies through doubly robust learning. Consequently, it significantly improves the accuracy of long-term treatment effect estimates. Additionally, its low-complexity design not only results in computational cost savings compared to existing methods, but also makes it scalable for billion-scale applications. Through both offline experiments and online production tests, the framework consistently demonstrated significant improvements in top-line business metrics over months. The framework has been fully deployed to live traffic in one of the world's largest social media platforms.
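For readers unfamiliar with doubly robust estimation, the textbook form (the augmented inverse-propensity-weighted estimator) combines an outcome model with the known treatment propensities of the data collection policy. The sketch below shows that generic estimator only; it is not the paper's framework, and the record layout `(t, y, e, mu1, mu0)` and the toy numbers are assumptions made for illustration.

```python
def doubly_robust_ate(records):
    """Textbook doubly robust (AIPW) estimate of the average treatment effect.

    Each record is (t, y, e, mu1, mu0):
      t   -- 1 if treated, 0 if control
      y   -- observed outcome
      e   -- propensity P(t = 1 | x) under the data collection policy
      mu1 -- outcome model prediction under treatment
      mu0 -- outcome model prediction under control
    """
    total = 0.0
    for t, y, e, mu1, mu0 in records:
        model_term = mu1 - mu0
        # Residual corrections, reweighted by the inverse propensity.
        correction = t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
        total += model_term + correction
    return total / len(records)


# Toy records: one treated unit, one control unit, propensity 0.5 for both.
toy = [
    (1, 3.0, 0.5, 2.0, 1.0),
    (0, 1.0, 0.5, 2.0, 1.0),
]
ate = doubly_robust_ate(toy)
print(ate)
```

The estimate stays consistent if either the propensities or the outcome model is correct, which is why logged propensities from a randomized data collection policy are valuable here.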


Ads for Explicit 'AI Girlfriends' Are Swarming Facebook and Instagram

WIRED

Meta's online ad library shows the company is hosting thousands of ads for AI-generated, NSFW companion or "girlfriend" apps on Facebook, Instagram, and Messenger. They promote chatbots offering sexually explicit images and text, using NSFW chat samples and AI images of partially clothed, unbelievably shaped, simulated women. Many of the virtual women seen in ads reviewed by WIRED are lifelike--if somewhat uncanny--young, and stereotypically pornographic. Prospective customers are invited to role-play with an AI "stepmom," connect with a computer-generated teen in a hijab, or chat with avatars who promise to "get you off in one minute." The ads appear to be thriving despite Meta's ad policies clearly barring "adult content," including "depictions of people in explicit or suggestive positions, or activities that are overly suggestive or sexually provocative."


Multi-line AI-assisted Code Authoring

Dunay, Omer, Cheng, Daniel, Tait, Adam, Thakkar, Parth, Rigby, Peter C, Chiu, Andy, Ahmad, Imad, Ganesan, Arun, Maddila, Chandra, Murali, Vijayaraghavan, Tayyebi, Ali, Nagappan, Nachiappan

arXiv.org Artificial Intelligence

CodeCompose is an AI-assisted code authoring tool powered by large language models (LLMs) that provides inline suggestions to tens of thousands of developers at Meta. In this paper, we present how we scaled the product from displaying single-line suggestions to multi-line suggestions. This evolution required us to overcome several unique challenges in improving the usability of these suggestions for developers. First, we discuss how multi-line suggestions can have a 'jarring' effect, as the LLM's suggestions constantly move around the developer's existing code, which, if left unaddressed, would decrease productivity and satisfaction. Second, multi-line suggestions take significantly longer to generate; hence we present several innovative investments we made to reduce the perceived latency for users. These model-hosting optimizations sped up multi-line suggestion latency by 2.5x. Finally, we conduct experiments on tens of thousands of engineers to understand how multi-line suggestions impact the user experience and contrast this with single-line suggestions. Our experiments reveal that (i) multi-line suggestions account for 42% of total characters accepted (despite only accounting for 16% of displayed suggestions) and (ii) multi-line suggestions almost doubled the percentage of keystrokes saved for users, from 9% to 17%. Multi-line CodeCompose has been rolled out to all engineers at Meta, and less than 1% of engineers have opted out of multi-line suggestions.